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ABSTRACT 

In this paper, we propose two algorithms for solving linear 
inverse problems when the observations are corrupted by 
Poisson noise. A proper data fidelity term (log-likelihood) is 
introduced to reflect the Poisson statistics of the noise. On 
the other hand, as a prior, the images to restore are assumed 
to be positive and sparsely represented in a dictionary of 
waveforms. Piecing together the data fidelity and the prior 
terms, the solution to the inverse problem is cast as the min- 
imization of a non-smooth convex functional. We establish 
the well-posedness of the optimization problem, characterize 
the corresponding minimizers, and solve it by means of pri- 
mal and primal-dual proximal splitting algorithms originating 
from the field of non-smooth convex optimization theory. Ex- 
perimental results on deconvolution and comparison to prior 
methods are also reported. 

Index Terms — Inverse Problems, Poisson noise, Duality, 
Proximity operator, Sparsity. 

1. INTRODUCTION 

Linear inverse problems in presence of Poisson noise have at- 
tracted less interest in the literature than their Gaussian coun- 
terpart, presumably because the noise properties are more 
complicated to handle. Such inverse problems have how- 
ever important applications in imaging such as restoration 
(e.g. deconvolution in medical and astronomical imaging), 
or reconstruction (e.g. computerized tomography). For in- 
stance, the well-known Richardson-Lucy has been proposed 
for deconvolution. The RL algorithm, however, amplifies 
noise after a few iterations, which can be avoided by intro- 
ducing regularization. In [fl], the authors presented a Total 
Variation (TV)-regularized RL algorithm, and [2 1 advocated 
a wavelet-regularized RL algorithm. 

In the context of Poisson linear inverse problems using 
sparsity-promoting regularization, a few recent algorithms 
have been proposed. For example, 0] stabilize the noise 
and proposed a family of nested schemes relying upon prox- 
imal splitting algorithms (Forward-Backward and Douglas- 
Rachford) to solve the corresponding optimization problem. 
The work of (4) is in the same vein. However, nested al- 
gorithms are time-consuming since they necessitate to sub- 
iterate. Using the augmented Lagrangian method with the 



alternating method of multipliers algorithm (ADMM), which 
is nothing but the Douglas-Rachford splitting applied to the 
Fenchel-Rockafellar dual problem, [5 1 presented a deconvo- 
lution algorithm with TV and sparsity regularization. This 
scheme however necessitates to solve a least-square problem 
which can be done explicitly only in some cases. 

In this paper, we propose a framework for solving lin- 
ear inverse problems when the observations are corrupted by 
Poisson noise. In order to form the data fidelity term, we take 
the exact Poisson likelihood. As a prior, the images to restore 
are assumed to be positive and sparsely represented in a dic- 
tionary of atoms. The solution to the inverse problem is cast 
as the minimization of a non-smooth convex functional, for 
which we prove well-posedness of the optimization problem, 
characterize the corresponding minimizers, and solve them 
by means of primal and primal-dual proximal splitting al- 
gorithms originating from the realm of non-smooth convex 
optimization theory. Convergence of the algorithms is also 
shown. Experimental results and comparison to other algo- 
rithms on deconvolution are finally conducted. 



Notation and terminology 

Let H a real Hilbert space, here a finite dimensional vector 
subspace of W 1 . We denote by ||.|| the norm associated with 
the inner product in T-L, and I is the identity operator on TL. 
||.|| p ,P> lis the £ p norm, x and a are respectively reordered 
vectors of image samples and transform coefficients. We de- 
note by riC the relative interior of a convex set C. A real- 
valued function / is coercive, if limi| a .|i_ > . +0o / (x) = +oo, 
and is proper if its domain is non-empty dom / = {x £ TL \ 
f(x) < +00} 7^ 0. To(H) is the class of all proper lower 
semicontinuous (lsc) convex functions from TL to (—00, +00]. 
We denote by |||M||| = max^^o t§TT tne spectral norm of the 
linear operator M, and ker(M) := {x E TL : Mi = 0, x 7^ 
0} its kernel. 

Let x E TL be an y/n x y/n image, x can be written as 
the superposition of elementary atoms ip 1 parameterized by 
7 E 2" such that x = X) 7 ei OL 1 ip 1 = \%\ — L, L ^ n. 

We denote by <I> : %' — > TL the dictionary (typically a frame 
of T-L), whose columns are the atoms all normalized to a unit 
^2-norm 



2. PROBLEM STATEMENT 

Consider the image formation model where an input image 
of n pixels x is indirectly observed through the action of a 
bounded linear operator H : fi — > /C, and contaminated by 
Poisson noise. The observed image is then a discrete col- 
lection of counts y = (y[i])i^i^ n which are bounded, i.e. 
y £ loo. Each count y[i] is a realization of an independent 
Poisson random variable with a mean (Hx)j. Formally, this 
writes in a vector form as 



y ~ V(Hx) 



(1) 



The linear inverse problem at hand is to reconstruct x from 
the observed count image y. 

A natural way to attack this problem would be to adopt 
a maximum a posteriori (MAP) bayesian framework with an 
appropriate likelihood function — the distribution of the ob- 
served data y given an original x — reflecting the Poisson 
statistics of the noise. As a prior, the image is supposed to be 
economically (sparsely) represented in a pre-chosen dictio- 
nary <fr as measured by a sparsity-promoting penalty sup- 
posed throughout to be convex but non-smooth, e.g. the t\ 
norm. 

From the probability density function of a Poisson random 
variable, the likelihood writes: 

p(y | g) = n ((^)M)" w ^(-(^)M) . (2) 

Taking the negative log-likelihood, we arrive at the following 
data fidelity term: 



fl :,6rH^/ poiss< „(#, 



(3) 



oisson 



if y[i] = 0, /poisson W]) 



-y[i] log(rj[i]) + rj[i] if T)[i] > 0, 
+oo otherwise, 

n[i] if n[i] e [o,+oo), 

+oo otherwise. 



Our aim is then to solve the following optimization prob- 
lems, under a synthesis-type sparsity prioiQ, 



argmin J (a), 
aew 

J : a i-> fi o H o $(q) + y^B^a) + %c ° *(a) ■ 



(P 7 ,V.) 



The penalty function $:q4 J2i=o ^(^H) ls positive, ad- 
ditive, and chosen to enforce sparsity, 7 > is a regulariza- 
tion parameter and %c is the indicator function of the convex 
set C. In our case, C is the positive orthant since we are fitting 
Poisson intensities, which are positive by nature. 



From the objective in (P 7 .^ I, we get the following 



Proposition 1. 

(i) fi is a convex function and so are fi o H and f± o H o <J>. 

(ii) fi is strictly convex if\/i £ {1, . . . , n}, y[i] 7^ 0. /1 o H o # 
remains strictly convex if& is an orthobasis and ker(H) = 0. 

(Hi) Suppose that (0, +oo)nH([0, +00)) / 0. Then J £ Y {U). 



'Our framework and algorithms extend to an analysis-type prior just as 
well, though we omit this for obvious space limitation reasons. 



2.1. Well-posedness of ( P 7 ,^| 



Let M. be the set of minimizers of problem dP^^l. Suppose 



that ^ is coercive. Thus J is coercive. Therefore, the follow- 
ing holds: 



Proposition 2. 

(i) Existence: {P 7 , v 



has at least one solution, i.e. M ^ 0. 
(ii) Uniqueness: ( |P 7 ^| has a unique solution iffy is strictly con- 
vex, or under (ii) of Proposition^^ 



3. ITERATIVE MINIMIZATION ALGORITHMS 
3.1. Proximal calculus 

We are now ready to describe the proximal splitting algo- 
At the heart of the splitting framework 



rithms to solve ( P 



is the notion of proximity operator. 

Definition 3 (|6 1). Let F £ T (H). Then, for every x £ H, the 
function y H> F(y) + \\x — y || 2 /2 achieves its infimum at a unique 
point denoted by prox F x. The operator prox F : W — > M thus 
defined is the proximity operator of F. 

Then, the proximity operator of the indicator function of a 
convex set is merely its orthogonal projector. One important 
property of this operator is the separability property: 

Lemma 4 (|7j). Let F k £ Tq(H), k £{!,■■■ ,K} and let G : 

{xk)i^k^K i-> J2k Fk(xk). Thenprox G = (prox Ffc )i^ fcs ;if • 

The following result can be proved easily by solving the 
proximal optimization problem in Definition [3] with fx as de- 
fined in (O, see also 151 . 

Lemma 5. Let y be the count map (i.e. the observations), the prox- 
imity operator associated to fi (i.e. the Poisson anti log-likelihood) 
is. 



P ,,x , j _ [ *j^i±VmEEII*m ] . (4) 



We now turn to prox 7 ^ which is given by Lemma [4] and 
the following result: 

Theorem 6 (| 9 1). Suppose that V i: ( i) 4>i is convex even-symmetric, 
non-negative and non-decreasing on R + , and ipi(0) = 0; (ii) ipi is 
twice differentiable onl\ {0}; (Hi) ipi is continuous on R, and ad- 
mits a positive right derivative at zero tpf + (0) — \im h ^ + 5 > 
0. Then, the proximity operator prox 5 ^ . (13) = a(0) has exactly one 
continuous solution decoupled in each coordinate B [i] : 



if W] 
if \ffl 



<^ l + (0) 
> 8A + (0) 



(5) 



Among the most popular penalty functions ipi satisfying 
the above requirements, we have ipi(a[i]) — |a[i]|,V i, 
in which case the associated proximity operator is soft- 
thresholding, denoted ST in the sequel. 



3.2. Splitting on the primal problem 

3.2.1. Splitting for sums of convex functions 

Suppose that the objective to be minimized can be expressed 
as the sum of K functions in ro(K), verifying domain quali- 
fication conditions: 



( K 
argmin F(x) = F k (x) 



(6) 



Proximal splitting methods for solving © are iterative algo- 
rithms which may evaluate the individual proximity operators 
prox ffc , supposed to have an explicit convenient structure, but 
never proximity operators of sums of the . 

Splitting algorithms have an extensive literature since the 
1970's, where the case K = 2 predominates. Usually, split- 
ting algorithms handling K > 2 have either explicitly or 
implicitly relied on reduction of (TlOb to the case K = 2 in 
the product space H . For instance, applying the Douglas- 
Rachford splitting to the reduced form produces Spingarn's 
method, which performs independent proximal steps on each 
Fk, and then computes the next iterate by essentially averag- 
ing the individual proximity operators. The scheme described 
in ifTUl is very similar in spirit to Spingarn's method, with 
some refinements. 



Algorithm 1: Primal scheme for solving ( P 7|1 /,) l 



Parameters: The observed image counts y, the dictionary <J>, 
number of iterations Mter, fJ, > and regularization 
parameter 7 > 0. 
Initialization: 

Mi e {1, 2, 3}, p COli) = (0, 0, 0) T . z = (0, 0, 0) T . 

Main iteration: 

For t = to iVitcr - 1, 

• Data fidelity (LemmaE): £(t,i)[i] = prox M/l/3 0( t ,i)[l]) 

• Sparsity-penalty (Lemma |6): 

C(t,l)[ 2 ] = P rOX M7*/3 CP(t.l) I 2 !)' 

• Positivity constraint : f(t,i)[3] = Pe(p(t,i)[3]). 

• Auxiliary constraints with Li and L2: (Lemma|7]l: 

£(t,2) = 7\er Li (P(t,2) )i £(t,3) = ^kcr L 2 (P(t, 3) )• 

• Average the proximity operators: 

& = (£(t,i) + £(t,2) +€(t,3))/3. 

• Choose 9 t £]0,2[. 

• Update the components: 

Vi G {1, 2, 3}, y>(t+i,i) = P(t,i) + 9t{2(,t -z t - f( t ,i)) 

• Update the coefficients estimate: z f +i = z t + #t(£t — z t) 

End main iteration 

Output: Reconstructed image x* — zjv iter [0]. 



3.2.2. Application to Poisson noise inverse problems 

is amenable to the form ©, by wisely in- 



Problem (P 



i,4> 



troducing auxiliary variables. As (P 7 ^,l involves two linear 



operators (€> and H), we need two of them, that we define 
as x% — and X2 = Hxi. The idea is to get rid of 



the composition of <fr and H. Let the two linear operators 
Li = [I - $] and L 2 = [-H I 0]. Then, the opti- 
mization problem (P 7l ^ I can be equivalently written: 



argmin hfa) + lc(%i) + 7*(a) + (7) 

(il,i2,o)e)ixKxH" v ' 



IkcrLi {Xl, X 2 , a) + lkcr-L 2 {xi,X2,a) 



(8) 



Notice that in our case K = 3 by virtue of separability of the 
proximity operator of G in x\, xi and a; see Lemma|4] 

The proximity operators of F and are easily accessi- 
ble through Lemma [5] and [6] The projector onto the positive 
orthant C is also trivial. It remains now to compute the projec- 
tor on ker Lj, i = 1,2, which by well-known linear algebra 
arguments, is obtained from the projector onto the image of 



Lemma 7. The proximity operator associated to Jker L ; is 

PkerL, = I - L- (Li o L?) _1 Li . (9) 

The inverse in the expression of T-W Li is (I + $ o <j> T ) _1 
can be computed efficiently when $ is a tight frame. Simi- 
larly, for L-2, the inverse writes (I + H o H*) _1 , and its com- 
putation can be done in the domain where H is diagonal; e.g. 
Fourier for convolution. 

Finally, the main steps of our primal scheme are sum- 
marized in Algorithm Q] Its convergence is a corollary of 
HO) [Theorem 3.4]. 

Proposition 8. Let (at)teN be a sequence generated by Algorithm\l\ 
Suppose that Proposition\7\(iii) is verified, and ~}2 teN t (2 — 9 t ) — 
+00. Then (zt)tem converges to a (non-strict) global minimizer of 

3.3. Splitting on the dual: Primal-dual algorithm 



Our problem (P 7 ^ 1 can also be rewritten in the form, 



argmin F o K(a) + 7 1 3/(a) 



(10) 



where now K 



and F : (xi,x 2 ) n- /i(xi) 



'Ho* 

%c{x2)- Again, one may notice that the proximity operator of 
F can be directly computed using the separability in x% and 

x 2 . 

Recently, a primal-dual scheme, which turns to be a pre- 
conditioned version of ADMM, to minimize objectives of the 
form ( [Tol l was proposed in IfTTI . Transposed to our setting, 
this scheme gives the steps summarized in Algorithm^ 

Adapting the arguments of IfTTI . convergence of the se- 
quence (o-t)teN generated by Algorithm|2]is ensured. 



Proposition 9. Suppose that Proposition Q}( Hi) holds. Let £ 

'[> (1 + |||H||| 2 ), choose t > and a such that gtC, < 1, and 
let (at)tgK as defined by Algorithm^ Then, (a)tgN converges to 
a (non-strict) global minimizer jP 7;1 /,} at the rate 0(1/ 1) on the re- 
stricted duality gap. 



3.4. Discussion 

Algorithm[T]and|2]share some similarities, but exhibit also im- 
portant differences. For instance, the primal-dual algorithm 
enjoys a convergence rate that is not known for the primal 
algorithm. Furthermore, the latter necessitates two operator 
inversions that can only be done efficiently for some <1> and 
H, while the former involves only application of these linear 
operators and their adjoints. Consequently, Algorithm |2]can 
virtually handle any inverse problem with a bounded linear 
H. In case where the inverses can be done efficiently, e.g. 
deconvolution with a tight frame, both algorithms have com- 
parable computational burden. In general, if other regular - 
izations/constraints are imposed on the solution, in the form 
of additional proper lsc convex terms that would appear in 
(P 7 i, both algorithms still apply by introducing wisely cho- 





RL-MRS | 2 | 


RL-TV |1| 


StabG 1 3 1 


PIDAL-FS 1 5 1 


Alg.HJ 


Alg.l2| 


MAE 


63.5 


52.8 


43 


43.6 


46 


43.6 


Times 


230s 


4.3s 


311s 


342s 


183s 


154s 



sen auxiliary variables. 



Algorithm 2: Primal-dual scheme for solving ( |P 7! ^| . 

Parameters: The observed image counts y, the dictionary 

number of iterations Ni tcv , proximal steps a > and r > 0, 

and regularization parameter 7 > 0. 

Initialization: 

a — a = £0 = Vo — 0. 

Main iteration: 

For t = to Alitor - 1. 

• Data fidelity (Lemma[5}: 

6+1 = (I-<7prox /l/o .)(£ t /(T + Ho*Q t ). 

• Positivity constraint: 774+1 = (I — oVc)(j]t / 'a + 3>St). 

• Sparsity-penalty (Lemma|6j: 



att+i = prox ig [a t 



(H*Ct+i +vt+i)). 
• Update the coefficients estimate: ctt+i = 2a t +i — at 

End main iteration 

Output: Reconstructed image x* = <&aAr iter . 



4. EXPERIMENTAL RESULTS 

Our algorithms were applied to deconvolution. In all exper- 
iments, ^ was the ^i-norm. Table [T] summarizes the mean 
absolute error (MAE) and the execution times for an astro- 
nomical image, where the dictionary consisted of the wavelet 
transform and the PSF was that of the Hubble telescope. Our 
algorithms were compared to state-of-the-art alternatives in 
the literature. In summary, flexibility of our framework and 
the fact that Poisson noise was handled properly, demonstrate 
the capabilities of our approach, and allow our algorithms to 
compare very favorably with other competitors. The compu- 
tational burden of our approaches is also among the lowest, 
typically faster than the PIDAL algorithm. Fig.[T]displays the 
objective as a function of the iteration number and time (in 
s). We can clearly see that Algorithm 2 converges faster than 
Algorithm 1 . 



Table 1 . MAE and execution times for the deconvolution of the sky image. 





Fig. 1. Objective function in function if iterations (left) and 
times (right). 

5. CONCLUSION 

In this paper, we proposed two provably convergent algo- 
rithms for solving the Poisson inverse problems with a spar- 
sity prior. The primal-dual proximal splitting algorithm seems 
to perform better in terms of convergence speed than the pri- 
mal one. Moreover, its computational burden is lower than 
most comparable of state-of-art methods. 
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